{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Detect Model Bias with Amazon SageMaker Clarify"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "## Amazon Science: _[How Clarify helps machine learning developers detect unintended bias](https://www.amazon.science/latest-news/how-clarify-helps-machine-learning-developers-detect-unintended-bias)_ \n",
    "\n",
    "[<img src=\"img/amazon_science_clarify.png\"  width=\"100%\" align=\"left\">](https://www.amazon.science/latest-news/how-clarify-helps-machine-learning-developers-detect-unintended-bias)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Terminology\n",
    "\n",
    "* **Bias**: \n",
    "An imbalance in the training data or the prediction behavior of the model across different groups, such as age or income bracket. Biases can result from the data or algorithm used to train your model. For instance, if an ML model is trained primarily on data from middle-aged individuals, it may be less accurate when making predictions involving younger and older people.\n",
    "\n",
    "* **Bias metric**: \n",
    "A function that returns numerical values indicating the level of a potential bias.\n",
    "\n",
    "* **Bias report**:\n",
    "A collection of bias metrics for a given dataset, or a combination of a dataset and a model.\n",
    "\n",
    "* **Label**:\n",
    "Feature that is the target for training a machine learning model. Referred to as the observed label or observed outcome.\n",
    "\n",
    "* **Positive label values**:\n",
    "Label values that are favorable to a demographic group observed in a sample. In other words, designates a sample as having a positive result.\n",
    "\n",
    "* **Negative label values**:\n",
    "Label values that are unfavorable to a demographic group observed in a sample. In other words, designates a sample as having a negative result.\n",
    "\n",
    "* **Facet**:\n",
    "A column or feature that contains the attributes with respect to which bias is measured.\n",
    "\n",
    "* **Facet value**:\n",
    "The feature values of attributes that bias might favor or disfavor."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Posttraining Bias Metrics\n",
    "https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-measure-post-training-bias.html\n",
    "\n",
    "* **Difference in Positive Proportions in Predicted Labels (DPPL)**:\n",
    "Measures the difference in the proportion of positive predictions between the favored facet a and the disfavored facet d.\n",
    "\n",
    "* **Disparate Impact (DI)**:\n",
    "Measures the ratio of proportions of the predicted labels for the favored facet a and the disfavored facet d.\n",
    "\n",
    "* **Difference in Conditional Acceptance (DCAcc)**:\n",
    "Compares the observed labels to the labels predicted by a model and assesses whether this is the same across facets for predicted positive outcomes (acceptances).\n",
    "\n",
    "* **Difference in Conditional Rejection (DCR)**:\n",
    "Compares the observed labels to the labels predicted by a model and assesses whether this is the same across facets for negative outcomes (rejections).\n",
    "\n",
    "* **Recall Difference (RD)**:\n",
    "Compares the recall of the model for the favored and disfavored facets.\n",
    "\n",
    "* **Difference in Acceptance Rates (DAR)**:\n",
    "Measures the difference in the ratios of the observed positive outcomes (TP) to the predicted positives (TP + FP) between the favored and disfavored facets.\n",
    "\n",
    "* **Difference in Rejection Rates (DRR)**:\n",
    "Measures the difference in the ratios of the observed negative outcomes (TN) to the predicted negatives (TN + FN) between the disfavored and favored facets.\n",
    "\n",
    "* **Accuracy Difference (AD)**:\n",
    "Measures the difference between the prediction accuracy for the favored and disfavored facets.\n",
    "\n",
    "* **Treatment Equality (TE)**:\n",
    "Measures the difference in the ratio of false positives to false negatives between the favored and disfavored facets.\n",
    "\n",
    "* **Conditional Demographic Disparity in Predicted Labels (CDDPL)**:\n",
    "Measures the disparity of predicted labels between the facets as a whole, but also by subgroups.\n",
    "\n",
    "* **Counterfactual Fliptest (FT)**:\n",
    "Examines each member of facet d and assesses whether similar members of facet a have different model predictions.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import boto3\n",
    "import sagemaker\n",
    "import pandas as pd\n",
    "import numpy as np\n",
    "\n",
    "sess = sagemaker.Session()\n",
    "bucket = sess.default_bucket()\n",
    "role = sagemaker.get_execution_role()\n",
    "region = boto3.Session().region_name\n",
    "\n",
    "sm = boto3.Session().client(service_name=\"sagemaker\", region_name=region)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "%matplotlib inline\n",
    "%config InlineBackend.figure_format='retina'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Test data for bias\n",
    "\n",
    "We created test data in JSONLines format to match the model inputs. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_data_bias_path = \"./data-clarify/test_data_bias.jsonl\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!head -n 1 $test_data_bias_path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Upload the data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "test_data_bias_s3_uri = sess.upload_data(bucket=bucket, key_prefix=\"bias/test_data_bias\", path=test_data_bias_path)\n",
    "test_data_bias_s3_uri"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!aws s3 ls $test_data_bias_s3_uri"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store test_data_bias_s3_uri"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# List Pipeline Execution Steps\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%store -r pipeline_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(pipeline_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "\n",
    "import time\n",
    "from pprint import pprint\n",
    "\n",
    "executions_response = sm.list_pipeline_executions(PipelineName=pipeline_name)[\"PipelineExecutionSummaries\"]\n",
    "pipeline_execution_status = executions_response[0][\"PipelineExecutionStatus\"]\n",
    "print(pipeline_execution_status)\n",
    "\n",
    "while pipeline_execution_status == \"Executing\":\n",
    "    try:\n",
    "        executions_response = sm.list_pipeline_executions(PipelineName=pipeline_name)[\"PipelineExecutionSummaries\"]\n",
    "        pipeline_execution_status = executions_response[0][\"PipelineExecutionStatus\"]\n",
    "    except Exception as e:\n",
    "        print(\"Please wait...\")\n",
    "        time.sleep(30)\n",
    "\n",
    "pprint(executions_response)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline_execution_status = executions_response[0][\"PipelineExecutionStatus\"]\n",
    "print(pipeline_execution_status)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline_execution_arn = executions_response[0][\"PipelineExecutionArn\"]\n",
    "print(pipeline_execution_arn)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pprint import pprint\n",
    "\n",
    "steps = sm.list_pipeline_execution_steps(PipelineExecutionArn=pipeline_execution_arn)\n",
    "\n",
    "pprint(steps)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# View Created Model\n",
    "_Note:  If the trained model did not pass the Evaluation step (> accuracy threshold), it will not be created._"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for execution_step in steps[\"PipelineExecutionSteps\"]:\n",
    "    if execution_step[\"StepName\"] == \"CreateModel\":\n",
    "        model_arn = execution_step[\"Metadata\"][\"Model\"][\"Arn\"]\n",
    "        break\n",
    "print(model_arn)\n",
    "\n",
    "pipeline_model_name = model_arn.split(\"/\")[-1]\n",
    "print(pipeline_model_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Setup Posttraining Model Bias Analysis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sagemaker import clarify\n",
    "\n",
    "clarify_processor = clarify.SageMakerClarifyProcessor(\n",
    "    role=role, instance_count=1, instance_type=\"ml.c5.2xlarge\", sagemaker_session=sess\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Writing DataConfig and ModelConfig\n",
    "A `DataConfig` object communicates some basic information about data I/O to Clarify. We specify where to find the input dataset, where to store the output, the target column (`label`), the header names, and the dataset type.\n",
    "\n",
    "Similarly, the `ModelConfig` object communicates information about your trained model and `ModelPredictedLabelConfig` provides information on the format of your predictions.  \n",
    "\n",
    "**Note**: To avoid additional traffic to your production models, SageMaker Clarify sets up and tears down a dedicated endpoint when processing. `ModelConfig` specifies your preferred instance type and instance count used to run your model on during Clarify's processing."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## DataConfig"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bias_report_prefix = \"bias/report-{}\".format(pipeline_model_name)\n",
    "\n",
    "bias_report_output_path = \"s3://{}/{}\".format(bucket, bias_report_prefix)\n",
    "\n",
    "data_config = clarify.DataConfig(\n",
    "    s3_data_input_path=test_data_bias_s3_uri,\n",
    "    s3_output_path=bias_report_output_path,\n",
    "    label=\"star_rating\",\n",
    "    features=\"features\",\n",
    "    # label must be last, features in exact order as passed into model\n",
    "    headers=[\"review_body\", \"product_category\", \"star_rating\"],\n",
    "    dataset_type=\"application/jsonlines\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## ModelConfig"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_config = clarify.ModelConfig(\n",
    "    model_name=pipeline_model_name,\n",
    "    instance_type=\"ml.m5.4xlarge\",\n",
    "    instance_count=1,\n",
    "    content_type=\"application/jsonlines\",\n",
    "    accept_type=\"application/jsonlines\",\n",
    "    # {\"features\": [\"the worst\", \"Digital_Software\"]}\n",
    "    content_template='{\"features\":$features}',\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## _Note: `label` is set to the JSON key for the model prediction results_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictions_config = clarify.ModelPredictedLabelConfig(label=\"predicted_label\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## BiasConfig"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bias_config = clarify.BiasConfig(\n",
    "    label_values_or_threshold=[\n",
    "        5,\n",
    "        4,\n",
    "    ],  # needs to be int or str for continuous dtype, needs to be >1 for categorical dtype\n",
    "    facet_name=\"product_category\",\n",
    "    #                                facet_values_or_threshold=['Gift Card'],\n",
    "    group_name=\"product_category\",\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Run Clarify Job"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "clarify_processor.run_post_training_bias(\n",
    "    data_config=data_config,\n",
    "    data_bias_config=bias_config,\n",
    "    model_config=model_config,\n",
    "    model_predicted_label_config=predictions_config,\n",
    "    #    methods='all', # FlipTest requires all columns to be numeric\n",
    "    methods=[\"DPPL\", \"DI\", \"DCA\", \"DCR\", \"RD\", \"DAR\", \"DRR\", \"AD\", \"CDDPL\", \"TE\"],\n",
    "    wait=False,\n",
    "    logs=False,\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "run_post_training_bias_processing_job_name = clarify_processor.latest_job.job_name\n",
    "run_post_training_bias_processing_job_name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "display(\n",
    "    HTML(\n",
    "        '<b>Review <a target=\"blank\" href=\"https://console.aws.amazon.com/sagemaker/home?region={}#/processing-jobs/{}\">Processing Job</a></b>'.format(\n",
    "            region, run_post_training_bias_processing_job_name\n",
    "        )\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "display(\n",
    "    HTML(\n",
    "        '<b>Review <a target=\"blank\" href=\"https://console.aws.amazon.com/cloudwatch/home?region={}#logStream:group=/aws/sagemaker/ProcessingJobs;prefix={};streamFilter=typeLogStreamPrefix\">CloudWatch Logs</a> After About 5 Minutes</b>'.format(\n",
    "            region, run_post_training_bias_processing_job_name\n",
    "        )\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "display(\n",
    "    HTML(\n",
    "        '<b>Review <a target=\"blank\" href=\"https://s3.console.aws.amazon.com/s3/buckets/{}?prefix={}/\">S3 Output Data</a> After The Processing Job Has Completed</b>'.format(\n",
    "            bucket, bias_report_prefix\n",
    "        )\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pprint import pprint\n",
    "\n",
    "running_processor = sagemaker.processing.ProcessingJob.from_processing_name(\n",
    "    processing_job_name=run_post_training_bias_processing_job_name, sagemaker_session=sess\n",
    ")\n",
    "\n",
    "processing_job_description = running_processor.describe()\n",
    "\n",
    "pprint(processing_job_description)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "running_processor.wait(logs=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Download Report From S3"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!aws s3 ls $bias_report_output_path/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!aws s3 cp --recursive $bias_report_output_path ./generated_bias_report/"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.core.display import display, HTML\n",
    "\n",
    "display(HTML('<b>Review <a target=\"blank\" href=\"./generated_bias_report/report.html\">Bias Report</a></b>'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# View Bias Report in Studio\n",
    "In Studio, you can view the results under the experiments tab.\n",
    "\n",
    "<img src=\"img/bias_report.gif\">\n",
    "\n",
    "Each bias metric has detailed explanations with examples that you can explore.\n",
    "\n",
    "<img src=\"img/bias_detail.gif\">\n",
    "\n",
    "You could also summarize the results in a handy table!\n",
    "\n",
    "<img src=\"img/bias_report_chart.gif\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Release Resources"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%html\n",
    "\n",
    "<p><b>Shutting down your kernel for this notebook to release resources.</b></p>\n",
    "<button class=\"sm-command-button\" data-commandlinker-command=\"kernelmenu:shutdown\" style=\"display:none;\">Shutdown Kernel</button>\n",
    "        \n",
    "<script>\n",
    "try {\n",
    "    els = document.getElementsByClassName(\"sm-command-button\");\n",
    "    els[0].click();\n",
    "}\n",
    "catch(err) {\n",
    "    // NoOp\n",
    "}    \n",
    "</script>"
   ]
  }
 ],
 "metadata": {
  "instance_type": "ml.t3.medium",
  "kernelspec": {
   "display_name": "Python 3 (Data Science)",
   "language": "python",
   "name": "python3__SAGEMAKER_INTERNAL__arn:aws:sagemaker:us-east-1:081325390199:image/datascience-1.0"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
